The Biotechnology Systems Branch of the Scientific and Technical Information 
Center (STIC) detected errors when processing the following computer readable 
form: 
Processed by STIC: 





THE ATTACHED PRINTOUT EXPLAINS DETECTED ERRORS. 

PLEASE FORWARD THIS INFORMATION TO THE APPLICANT BY EITHER: 
INCLUDING A COPY OF THIS PRINTOUT IN YOUR NEXT COMMUNICATION TO THE 
APPLICANT, WITH A NOTICE TO COMPLY or, 

TELEPHONING APPLICANT AND FAXING A COPY OF THIS PRINTOUT, WITH A 
NOTICE TO COMPLY 

FOR CRF SUBMISSION QUESTIONS, PLEASE CONTACT MARK SPENCER, 703-308-4212. 

FOR SEQUENCE RULES INTERPRETATION, PLEASE CONTACT ROBERT WAX, 703-308-4216. 
PatentIn 2.1 e-mail help: patin21help@uspto.gov or phone 703-306-4119 (R. Wax) 
PatentIn 3.0 e-mail help: patin3help@uspto.gov or phone [illegible] 




TO REDUCE ERRORED SEQUENCE LISTINGS, PLEASE USE THE CHECKER 
VERSION 3.0 PROGRAM, ACCESSIBLE THROUGH THE U.S. PATENT AND 
TRADEMARK OFFICE WEBSITE. SEE BELOW: 



Checker Version 3.0 

The Checker Version 3.0 application is a state-of-the-art Windows-based software program 
employing a logical and intuitive user interface to check whether a sequence listing is in 
compliance with format and content rules. Checker Version 3.0 works for sequence listings 
generated for the original version of 37 CFR §§1.821 - 1.825 effective October 1, 1990 (old 
rules) and the revised version (new rules) effective July 1, 1998, as well as World Intellectual 
Property Organization (WIPO) Standard ST.25. 

Checker Version 3.0 replaces the previous DOS-based version of Checker and is Y2K-compliant. 
Checker allows public users to check sequence listings in Computer Readable Form 
(CRF) before submitting them to the United States Patent and Trademark Office (USPTO). 
Use of Checker prior to filing the sequence listing is expected to result in fewer errored sequence 
listings, thus saving time and money. 

Checker Version 3.0 can be downloaded from the USPTO website at the following address: 

http://www.uspto.gov/web/offices/pac/checker 
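
As an illustration only of the kind of format check described above, the sketch below verifies the cumulative base count printed at the end of each sequence row. It is not the Checker program and does not reproduce its rules; the function name `check_running_counts` is hypothetical, and it assumes rows as they would appear in the applicant's file (base groups followed by a running total, without the line numbers STIC adds in this printout).

```python
import re

# Hypothetical helper for illustration; not part of the USPTO Checker program.
# Assumption: each sequence row is whitespace-separated base groups followed
# by the cumulative number of bases through the end of that row.
BASES = re.compile(r"[ACGTUNRYKMSWBDHV]+", re.IGNORECASE)


def check_running_counts(rows):
    """Return (row_index, expected_total, stated_total) for each mismatching row."""
    mismatches = []
    total = 0
    for index, row in enumerate(rows):
        tokens = row.split()
        # Skip anything that is not "bases ... count".
        if len(tokens) < 2 or not tokens[-1].isdigit():
            continue
        bases = "".join(tokens[:-1])
        if not BASES.fullmatch(bases):
            continue
        total += len(bases)
        stated = int(tokens[-1])
        if stated != total:
            mismatches.append((index, total, stated))
    return mismatches


if __name__ == "__main__":
    sample = [
        "ATGGATGCCG CTCTGCTCCT 20",
        "ACGGGCGAGC TCCCA 35",
        "AAATG 41",  # deliberately wrong: 40 bases so far
    ]
    for index, expected, stated in check_running_counts(sample):
        print(f"row {index}: expected {expected}, listing states {stated}")
```

A row whose stated total disagrees with the bases actually counted is reported together with both values, which is the sort of discrepancy a listing should be free of before submission.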
TIME: 09:19:35 



Input Set : A:\converted sequences v2.txt 
Output Set: N:\CRF3\08062001\I765061A.raw 



E-- 
C-- 

C-- 



C--> 



SEQUENCE LISTING 
(1) GENERAL INFORMATION: 

(iii) NUMBER OF SEQUENCES : (l) (Q 
(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: US/09/765,061A 

(B) FILING DATE: 17-Jan-2001 



ERRORED SEQUENCES 



C--> 



4 
5 
6 
7 
8 
9 

10 
11 
12 
13 
14 
15 
16 
18 
19 
20 
21 
22 
23 
24 
25 
26 
27 
28 
29 
30 
31 
32 
33 
34 
35 
36 
37 
38 
39 
40 
41 



(2) INFORMATION FOR SEQ ID NO: 1 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH :^749 bases" 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) human 
(ix) FEATURE: 

(A) NAME/KEY: AIPL1 gene 

(B) LOCATION: 17p13.1 

(D) OTHER INFORMATION: produces aryl-hydrocarbon 
receptor interacting protein-like 1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 
GGCCTCCCAA AGTGCTGGAT TACAGGCGTG AGTCACCGCG CCTGGTCCCC TGTCTTCTTT 
AAGAAAGCTC AGCGGACCTT TTTCCTTCTT GGGGTGGAAC AAAAAGCCAA ATCTAGCACA 
ACCCTGGGCA GGGGCCCAGA ATCACTGGAA GCAAAGGTGG ATGGGATAGG AGGCGAGGCT 
GCCTGTGGAC CACAGGCCCG GCCCGAGTGG CTCTGATGAG AAGCCGGGGC GCCTAGGTCA 
CCGCCCCCAC CGTCTGCCCT TCCCCCCACT CCTCCTGGCT GGGTAAATCC CAGAGTCTCA 
GCCGCCTAAG TGTCTTCCCC GGAGGTGAGA TTATCTCCGC CTGTGCTGGA CACCTCCCTT 
TCTCCTGCAG CCATGGATGC CGCTCTGCTC CTGAACGTGG AAGGGGTCAA GAAAACCATT 
CTGCACGGGG GCACGGGCGA GCTCCCAAAC TTCATCACCG GATCCCGAGT GAGTGGGGCC 
CCTCCGGAGC AGACAGGGTC CCCCACAGCA GCTTTCAACA TTCCAGGTGT 
ACTGTAAACA GCTTTCAGCT GTGCCAAAAA AACAGCCAGG CAGCCCCAGC 
CGGGGAGCTC CCAGCGTTTA CCCATTCAGG 
TTAGCATGGG CTGAGGGGAA GGGCTTTTGG 
GAAGAAAGGG AGTCCGAGGA GTCTTGGTAT 
GACTGAAGGG TGCGTCTGTG GCTACAGAAT 



Does Not Comply 
Corrected Diskette Needed 



GCCCCAAGGC 
GCTGGGCCTC 
AGATTCAACT 
TGTTGAGTGA 



GGGCATTTTT GGTACTTTGC 
GAATTTTCTG GGGCCCTAAA 
TTGTCCCCAA ATGTCTGTTA GGCTTCCCTG 
TCGGGCTTTG GCCAGGCGAG GCGGCTCCCG 



CCTGTAATCC CAGCACTTTG GGAGGCCAAG ATGGGCAGAT CATGAGGTCA AGAGTTCGAG 
ACCAGCCTGA CCAACATGTG AAACCCCATC TCTACTGAAA ATACAAAAAT TAGCCAGATG 
TGCTGTGGCG CCTGTAATCC CAGTTCAGAT ACTCAGGAGA CTTGAGGCAG GAGAATCACT 
TGAGCCCAGG AGGTGGAGGT TGCAGTGAGC CGAGATCATA CCACTGCACT CCAACCTGGG 
CAACAGAGTG AGACTCTGTC TCAGAAAAAA AAAAAAAAAA AAGAACTCGG GCTTACTTGA 
GGAAGGATTT CTGGACGCAC AGGGCTGTGG GGAGTGGAAT GGGGTCTGTA GGGAGGGGTG 
GGTCCCTCCT CCCTGGGGGG TGCAGGCAGG GTGGAGGTGC TCCAGGGGTC TGAGGCATCT 
GATGGGGTGA ACTGAGTGAG CTGACCCTGG GGACAGCCCT GGGTGTCGGT GGCAAGGGGG 
TGGCTTCTGC CGGGCCTTGA ACAGTGTGTC TAGAGCAGAG TGCACCGTCT CGGTGACTAG 
GTGATCTTTC ATTTCCGCAC CATGAAATGT GATGAGGAGC GGACAGTCAT TGACGACAGT 



60 

120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
RAW SEQUENCE LISTING DATE: 08/06/2001 

PATENT APPLICATION: US/09/765,061A TIME: 09:19:35 

Input Set : A:\converted sequences v2.txt 
Output Set: N:\CRF3\08062001\I765061A.raw 



42 CGGCAGGTGG GCCAGCCCAT GCACATCATC ATCGGAAACA TGTTCAAGCT CGAGGTCTGG 1500 

43 GAGATCCTGC TTACCTCCAT GCGGGTGCAC GAGGTGGCCG AGTTCTGGTG CGACACCATC 1560 

44 GTAAGTAGGC CCTGCGCGCC TGTCTCCTGG GACTAGTCTT TTCTGGGCTC ACCCACCCGC 1620 
45 TTTGCGGGGC TGCTGTGTTT CGGGAAAGCT GGGACTCAAG CGAAGCTTTG CAAAGCCAGT 1680 
46 CCTGCAAACT TATTCCCCAC CGTGTGCATG TGAAGATGGA GGGAACAAGG GCTGGAAGGG 1740 
47 GTGACCCATG CTGTGGCTGG CTGGTGGGGA GCAGGGCTAT GACCAGCAGG AGTGAGCTGG 1800 

48 CCCACTTCAC AGTCCTCACA TCTGTGTGTG TGTGTGTGTG TGTGTGTGTG TGTGTGTGTG 1860 

49 TGTGTGTGTG AGAGAGAGAG AGAGAGAGAG AGAGAGNNNN NNNNNNTAGC CTTAGGACTT 1920 

50 ATTGCAGAGA CCAACACCTA ACAATGTAAT CAGGCAGCCA GTGCAGGACA TAAATAAGTA 1980 

51 AGGCAGTGTG CTTTGGGCCA CAAAAGCACG CTCAGCTTGC TGGAAGCCAT GGGTGCCGAG 2040 

52 CTGGGGGCTG CTGAGTCAGG GCCAAAGGGG GCCCCTCCCT GCAGTAAGCT GGTTCTGGGG 2100 

53 CCTCTCCCTC CCTTGGTCCA GCTCTTAATC CCAACAGGCT CAACAGCCAT CTGCTTGTCT 2160 

54 CTTCCATAAA GAGGCAGAAG GCATTTCGGG CTAATCCCGG CCGGTGGGGC GGGCAGGGTG 2220 

55 ACCTCTGTCT CTGTGCTGGT GACCTGGAGG CAGAGCTGAA CTGCTGCATA GAGTTTCAGC 2280 

56 CCCTTCACTT CACATGTTGC ATGTGGGGCC AGTGCTGGGT CATCTCAGAA GCCGGTCCAA 2340 

57 GGAGATGGGT TCTCAGGGAG CCTAGTTGGG GAAACTGAGG CCCAGCATAC ATACAGCAGG 2400 

58 CCTCGCTGAG GCCGCACGGC GGATCTTCCC AGCCCTCCTT CATCCCAAGG GTGGCAAACT 2460 

59 CAGCTCCCAT GCTGGCTGAA GCTGTGATGA GCCAGATCTA TATCTGCACC ATCTCATTTA 2520 

60 ATCCCTACAG CAGCCCTAAT ATCGAACAGG AGCAACCCAG GGAACTGAGT TTCAGAGAAG 2580 

61 TGCAGAGACC TGGGCTCACC GCTAACCTGC AGCACTGCCA GGACACCAAA GCGACTCTCT 2640 

62 TGGACCCTGG AGTCCTGCTC CTTCTACTGC CCCACACTGC CCTTCCTGCG AGTCATAGGC 2700 

63 TTTGCAGAGG TCAGGGTTTC CCTGGGGCAG AGATGTGTTA CAGTGGACCA CAAGGGCCAG 2760 

64 AAGAGGCAGC CGGAGGCTAA CAGCATATGG CCTCTGGAGC CAGGTTTGAA TCCTGGCTGC 2820 

65 GTCATTTCCT AGCTGTGTGA CCTTAAGCAA GTTGCTTGCG TCTCTGGGCT GTAGTTTCCC 2880 

66 CATCCGTAAA ATGGGATAAT AGTGCCTGCC TTGAATTGTC ATAAGGATTG AAGGGGCTCA 2940 

67 TAACAGTGTG AAGTGCTTTG CCTGGCACAC AGTTAACCAC AGTTAGTATG AGTGGCATAG 3000 
68 TGAGGGAGCA GGATTCCTCC CAGGAGGGGC TCTGAGTGGA GGCCTTTTAT GGCCCACCTA 3060 

69 GCTCTGGGCA GGTAGCCTGG ATGCCATCCA TCCGTTTATC CCCACAGCAC ACGGGGGTCT 3120 

70 ACCCCATCCT RTCCCGGAGC CTGAGGCAGA TGGCCCAGGG CAAGGACCCC ACAGAGTGGC 3180 

71 ACGTGCACAC GTGCGGGCTG GCCAACATGT TCGCCTACCA CACGCTGGGC TACGAGGACC 3240 

72 TGGACGAGCT GCAGAAGGAG CCTCAGCCTC TGGTCTTTGT GATCGAGCTG CTGCAGGTGG 3300 

73 GGCTGGGGTT GGCAGGGCTG GAGGGCTGTG CCAGCACTGG AGAGGGACAG CGGGCATCAT 3360 

74 GGGCACCCCC ACCCCACTGG CCACTGGACA GTGCCCTGTT TCTGTTTAGA TAATACGAGA 3420 

75 GGGTTCATAA GCCATGGGAG AATACGAATT TGAAAAAAAA GTCCTCTGAT TTTTCCACAA 3480 

76 GAAAAGTCCT TTGGTGCTGG GCATGGTGGC CCACGCCTGT AATCCTAGCA CTTTGGGAGG 3540 

77 CCGAGGGGGT TGGATCACCT GAGGTCAGGA GTTCGAAGAC CAGCCTGGCC AACATGGTAA 3600 

78 AACCCCGTCT CTATTAAAAA CACAAAAATT AACCGGGTGT GGTGGTGCAT GCCTGTAATC 3660 

79 AATCCCAGCT ACTTGGGAAT TTGAGGCATG AGAATTGCTT GAACCTGGAA GTGGAGGTTG 3720 

80 CAGTGAGCAG AGATCATGTC AGTGCATTTT AACCTGGGTG ACAGAGTGAG ACTCCATGTC 3780 

81 CAAAAAAAAG AAAAAAAAAA AAAGTCCACT TGGAACCAGT TTTTAAAAAT GTGATTCATT 3840 

82 TTCATTGTGG AGGCATTTTA TCCACTTCCA CTTTCATTTT CAGGAGTTGG AGATTATAAC 3900 

83 CGCCTCCTTG GTTCCTGTGG TTTGTGGGTT CAGACTTGGT TCTCTNGTGG CGGGAGAGGC 3960 

84 TGCATGGAAC TCCCCACATC CTCCCAACCA GGAGCCCCAG AGTGATTGGC AGCGCGTGTT 4020 

85 TGTGGATTGG TGAGAGAGGG TTAGGGCCAG GGTCAAGGTC AGGTCAGGAC TCAGCTTATG 4080 

86 GCCAAGACTG AGGCTCAGCC TGAGAGCTAT GTGGGTGAAT AAAATAAAAT AAGAACTGTG 4140 

87 TCAACCAAGG GCCCCTTACA GGCTTGCTGT CACAGTTGTG TGGTCTGTGC ACTGCACAAG 4200 

88 GTGCACCGGC ATCTCCTCCA AGGTGCTCAT TATAGACATT GTATATTGGT ATTTCCATAA 4260 

89 TGAGAAGTTT CCAGCAGATG GCAATAGTGT ATTGTTCTAA CAAAACGAGT ATTCGTGACA 4320 

90 ATTTTCTGAA TATTAGAAGT GAAGTGTCTT GATGAACGGG CACCTTTTCC TAGTTTGCAC 4380 
91 AAAGACATTG ATTTAGGGCA GGGTTTTCGG CGTTGTTGCT TCTTTCCCTT GTCTGTATGC 4440 

92 ACTTGACCAG CAAGCATGAC TTCAGGGAGA TGTGCCACAG GGTCCTGTTT TTCGGGTCTC 4500 

93 TGATGGGGTG CAGGCCCCTG GGGTCCCTGC CTCACTGACC TGCAGCTCTG GGGCCAGGTT 4560 

94 GATGCCCCGA GTGATTACCA GAGGGAGACC TGGAACCTGA GCAATCATGA GAAGATGAAG 4620 

95 GCGGTGCCCG TCCTCCACGG AGAGGGAAAT CGGCTCTTCA AGCTGGGCCG CTACGAGGAG 4680 

96 GCCTCTTCCA AGTACCAGGA GGCCATCATC TGCCTAAGGA ACCTGCAGAC CAAGGTCAGA 4740 

97 GGCCGCTGGC CAGGGGTGGG AAGTGGCGCT GACTCTGGGG GGCCTGCCCA GTGCCGGCCA 4800 

98 GGGTGGGGCG GGGGTTGGGC AGCTGCCTGA GGTCATGGCT GACCTTCTCC CTGGGCAGGA 4860 

99 GAAGCCATGG GAGGTGCAGT GGCTGAAGCT GGAGAAGATG ATCAATACTC TGATCCTCAA 4920 

100 CTACTGCCAG TGCCTGCTGA AGAAGGAGGA GTACTATGAG GTGCTGGAGC ACACCAGTGA 4980 

101 TATTCTCCGG CACCACCCAG GTGCGCGGGG CTGCAGGGGC GGACAGTGAG GGGGCGCCCA 5040 

102 GCCCAGGGCC ACGGAGACAC CTGCCATAGC CTTCCTGGAC TTTTCTTTCC ACCCCACCAG 5100 

103 GGCACCAAAC CTTGTCTCCA CCCAGCCGGG TTTCCCCGAG TGTGTAACTG AATTGTGGGT 5160 

104 GATGGATGGG CAGTGCTTGG CGCGGGGCGG CCTTTATTTT AATGTGTGTT TGAACACTTA 5220 

105 CCCAGGAAGC TCGCCAAGCT TGTGATTTCA GCGGAACGGT AAACAGGCGT TTAAAAAGAG 5280 

106 GGGCAATCAA TATAGGGAAA AATATTATGA TGTCGGTACT AGTACTGGTG TTGCGAGGAT 5340 

107 ATGGCACCGC AGTACTAGAT TGACTTAATG CTCGAATCGT GCTCACAGTA AAAACATCCA 5400 

108 GCCCCTGGCT CATGCATCAG GCACACGTCG TCTGCGTTTA TTATCTCATT TAATCCTCAT 5460 

109 AATCCTCATA ATCACCATAT GAGGGAGGTG CAGGGAAAGG GGCCTGAAGG TTATCTAATT 5520 

110 TAGGTAGCGT CTATAAGAAA AATAAAACAA AGTTATGAAT ATAAAATTAC TCACAGGGCC 5580 

111 TTAAAAAGGA GAGGAGGAGG TACTGCTATT ATGATCATCA TCTCCATCTT ACAGTTGAGG 5640 

112 AAACCGAGGG ATGGGGGATA CAGAGAGGTT AAGGATCATG GCGGGGCTGA GGGTCTTGGA 5700 

113 GGCTGGTGAG TCCCAGCTGG GCTGGGGCTG CCTCTGAGGC TGGGAAGGGA GCTGTAGCTG 5760 

114 GATGCTCCCT GCTCCCCACA GGCATCGTGA AGGCCTACTA CGTGCGTGCC CGGGCTCACG 5820 

115 CAGAGGTGTG GAATGAGGCC GAGGCCAAGG CGGACCTCCA GAAAGTGCTG GAGCTGGAGC 5880 

116 CGTCCATGCA GAAGGCGGTG CGCAGGGAGC TTGAGGCTGC TGGAGAACCG CATGGCGGAG 5940 

117 AACAGGAGGA GGAGCGGCTG CGCTGCCGGA ACATGCTGAG CCAGGGTGCC ACGCAGCCTC 6000 

118 CCGCAGAGCC ACCCACAGAG CCACCCGCAC AGTCATCCAC AGAGCCACCT GCAGAGCCAC 6060 

119 CCACAGCACC ATCTGCAGAG CTGTCCGCAG GGCCCCCTGC AGAGCCAGCC ACAGAGCCAC 6120 

120 CCCCGTCCCC AGGGCACTCG CTGCAGCACT GAGCCCCCTG AGGCCCACAG CCACCCAGGC 6180 

121 AGGGAGCAAG TGGCCTGGTC ACTTCTGGTT CGATTGACCA GGATCGTGGT GTCACTTTTT 6240 

122 AAAATTTAAA ATTAATTTTT GAAATCAAAG TCAGACACAC CCATGGTAAA AAAAAAAAAA 6300 

123 AAAACAATCC CAAGGGTACA GAAGAGCTTA TGAATAAAAG TAGTTTTCTC CTCTACCCCT 6360 

124 CTCATTCCTT CCGTGCCATG GTTTTAATTG ACCCTGTTTT TAATTCTTCT GGTAGTTTTC 6420 

125 TCTATTTCCA AGTAATCTGT TTAAATCAGT TTCTAGATTT TACCCCATGT CAATGACAAA 6480 

126 TGAGGATTTG ATGCTCTGAT CCTTTCTCAT GCCTGATACC CCTCCCTGTC TCCCCATTTT 6540 

127 GGATAGTTAC ATTTGGGGGT CATCTCGGTG ATTTTTGTAA CTTTACGCAG GACACTTAGA 6600 

128 GCTCTCTAGA ATCCCACTGA CTTTAGTGGG GTCTTGATGT AGGGTGGGCA AGCCCCGACA 6660 

343 (D) OTHER INFORMATION: produces aryl-hydrocarbon 

344 receptor interacting protein-like 1 

345 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

C--> 347 ATGGATGCCG CTCTGCTCCT GAACGTGGAA GGGGTCAAGA AGACCATTCT GCACGGGGGC 60 

348 ACGGGCGAGC TCCCAAATTT CATCACCGGA TCCCGAGTGA TCTTTCATTT CCGCACCATG 120 

349 AAATGTGATG AGGAGCGGAC GGTGATTGAC GACAGCAGGG AGGTGGGCCA GCCCATGCAC 180 
350 ATCATCATCG GGAACATGTT CAAGCTGGAG GTCTGGGAGA TCCTGCTCAC GTCCATGCGG 240 

351 GTGCGAGAGG TGGCCGAGTT CTGGTGCGAC ACCATCCACA CGGGGGTCTA CCCCATCCTG 300 

352 TCCCGGAGCC TGCGGCAGAT GGCCCAGGGC AAGGACCCGA CGGAGTGGCA TGTGCACACG 360 

353 TGCGGGCTGG CCAACATGTT CGCCTACCAC ACGCTGGGCT ACGAGGACCT GGATGAGCTG 420 

354 CAGAAGGAGC CTCAGCCTCT GATCTTTGTG ATCGAGCTGC TGCAGGTTGA TGCCCCAAGT 480 

355 GATTACCAGA GGGAGACCTG GAACCTGAGC AATCACGAGA AGATGAAGGT GGTGCCCGTC 540 

356 CTCCATGGAG AAGGAAATAG GCTCTTCAAG CTGGGCCGCT ACGAGGAGGC CTCTTCCAAG 600 

357 TACCAGGAGG CCATCATCTG CCTAAGGAAC CTGCAGACCA AGGAGAAACC CTGGGAGGTG 660 

358 CAGTGGCTGA AGCTGGAGAA GATGATCAAT ACCCTGATCC TCAACTACTG TCAGTGTCTG 720 

359 CTGAAGAAGG AGGAGTACTA CGAGGTCCTG GAGCATACCA GTGACATTCT CCGGCACCAC 780 
360 CCAGGCATTG TGAAGGCCTA CTATGTGCGC GCCCGGGCTC ACGCGGAGGT GTGGAACGAG 840 
361 GCCGAGGCCA AGGCGGACCT CCAGAAAGTG CTGGAGCTGG AGCCGTCCAT GCAGAAGGCG 900 
362 GTGCGCAGGG AGCTGAGGCT GCTGGAGAAC CGCATGGCGG AGAAGCAGGA GGAGGAGCGG 960 
363 CTGCGCTGCC GCAACATGCT GAGCCAGGGG GCCACGTGGT CCCCCGCGGA GCCACCCGCA 1020 

364 GAGCCACCTG CAGAGTCATC CACAGAGCCA CCCGCAGAGC CACCTGCAGA GCCACCTGCA 1080 
E--> 365 GAGCTAACCT TGACCCCGGG GCACCCACTA CAGCACTGA ^Tl29 

383 (2) INFORMATION FOR SEQ ID NO: 10: 

384 (i) SEQUENCE CHARACTERISTICS: 

385 (A) LENGTH: 15 bases 

386 (B) TYPE: nucleic acid 

387 (C) STRANDEDNESS : single 

388 (D) TOPOLOGY: linear 

389 (ii) MOLECULE TYPE: DNA (genomic) 

390 (ix) FEATURE: 

391 (A) NAME/KEY: AIPL1 Trp88X mutation 

392 (B) LOCATION: 86... 90 

393 (D) OTHER INFORMATION: 

394 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
C--> 396 GAG TTC J^A TGC GAC 15 

E-> 397 QO A^L 

447 (2) INFORMATION FOR SEQ ID NO: 14: 

448 (i) SEQUENCE CHARACTERISTICS: 

449 (A) LENGTH: 15 bases 

450 (B) TYPE: nucleic acid 

451 (C) STRANDEDNESS: single 

452 (D) TOPOLOGY: linear 

453 (ii) MOLECULE TYPE: DNA (genomic) 
454 (ix) FEATURE: 

455 (A) NAME/KEY: AIPL1 Gln163X mutation 

456 (B) LOCATION: 161... 165 

457 (D) OTHER INFORMATION: 

458 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
C--> 460 GAT TAC TAG AGG GAG 15 
E--> 461 M<3 )(fc<h 

479 (2) INFORMATION FOR SEQ ID NO: 16: 

480 (i) SEQUENCE CHARACTERISTICS: 

481 (A) LENGTH: 15 bases 

482 (B) TYPE: nucleic acid 

483 (C) STRANDEDNESS : single 

484 (D) TOPOLOGY: linear 

485 (ii) MOLECULE TYPE: DNA (genomic) 

486 (ix) FEATURE: 

487 (A) NAME/KEY: AIPL1 Trp278X mutation 

488 (B) LOCATION: 276... 280 

489 (D) OTHER INFORMATION: 

490 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

C--> 492 GAG GTG TGA AAT GAG 15 

E-> 493 ff)X0«K 



495 (2) INFORMATION FOR SEQ ID NO: 17: 

496 (i) SEQUENCE CHARACTERISTICS: 

497 (A) LENGTH: 15 bases 

498 (B) TYPE: nucleic acid 

499 (C) STRANDEDNESS: single 

500 (D) TOPOLOGY: linear 

501 (ii) MOLECULE TYPE: DNA (genomic) 

502 (ix) FEATURE: 

503 (A) NAME/KEY: AIPL1 IVS2-2A to G mutation 

504 (B) LOCATION: 

505 (D) OTHER INFORMATION: 

506 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

C--> 508 TCC CCA C GG Cfl C.,-ACG 15 

E--> 509 <JVS2-SA-> G^ ^jhjJU 

543 (2) INFORMATION FOR SEQ ID NO: 20: 

544 (i) SEQUENCE CHARACTERISTICS: 

545 (A) LENGTH: 13 bases 

546 (B) TYPE: nucleic acid 

547 (C) STRANDEDNESS: single 

548 (D) TOPOLOGY: linear 

549 (ii) MOLECULE TYPE: DNA (genomic) 

550 (ix) FEATURE: 

551 (A) NAME/KEY: AIPL1 Pro351del12 mutation 

552 (B) LOCATION: Pro351 

553 (D) OTHER INFORMATION: TGCAGAGCCACC deleted 

554 sequence 

555 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

C--> 557 G CCA CCC ACA GCA 13 

E--> 558 ^ del a?e€AOAGCCACC 1 $J$ij0u 

576 (2) INFORMATION FOR SEQ ID NO: 22: 

577 (i) SEQUENCE CHARACTERISTICS: 

578 (A) LENGTH: 13 bases 

579 (B) TYPE: nucleic acid 

580 (C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(ix) FEATURE: 

(A) NAME/KEY: AIPL1 Ala336del2 mutation 

(B) LOCATION: Ala336 2 base deletion 

(D) OTHER INFORMATION: AG deleted sequence 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

T CCC GCA GCC ACC a a * 13 



INFORMATION FOR SEQ ID NO: 23: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 bases 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(ix) FEATURE: 

(A) NAME/KEY: AIPL1 Cys42X mutation 

(B) LOCATION: 40...44 
(D) OTHER INFORMATION: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

ATG AAA TGA GAT GAG 15 

INFORMATION FOR SEQ ID NO: 24: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 bases 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(ix) FEATURE: 

(A) NAME/KEY: AIPL1 Leu257del9 mutation 

(B) LOCATION: Leu 257 9 base deletion 

(D) OTHER INFORMATION: CTCCGGCAC deleted sequence 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

GAT ATT CAC CCA 12 
, doI^CTCCGGCAC ^ 
INFORMATION FOR SEQ ID NO: 25: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 bases 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(ix) FEATURE: 

(A) NAME/KEY: AIPL1 Val33ins8 mutation 

(B) LOCATION: Val 33 8 base insertion 

(D) OTHER INFORMATION: GTGATCTT inserted sequence 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 
C--> 637 GAC TAG GTG ATC TTG TGA TCT 21 

E--> 638 - QTGATCTT ^jJU^' 

640 (2) INFORMATION FOR SEQ ID NO: 26: 

641 (i) SEQUENCE CHARACTERISTICS: 

642 (A) LENGTH: 12 bases 

643 (B) TYPE: nucleic acid 

644 (C) STRANDEDNESS : single 

645 (D) TOPOLOGY: linear 

646 (ii) MOLECULE TYPE: DNA (genomic) 

647 (ix) FEATURE: 

648 (A) NAME/KEY: AIPL1 IVS1-9G to A Benign 
649 Variants/Polymorphisms 

650 (B) LOCATION: 

651 (D) OTHER INFORMATION: 

652 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

C--> 654 CTC AGT GAC TAG 12 

E--> 655 ^ G-> A ^ 

657 (2) INFORMATION FOR SEQ ID NO: 27: 

658 (i) SEQUENCE CHARACTERISTICS: 

659 (A) LENGTH: 12 bases 

660 (B) TYPE: nucleic acid 

661 (C) STRANDEDNESS: single 

662 (D) TOPOLOGY: linear 

663 (ii) MOLECULE TYPE: DNA (genomic) 

664 (ix) FEATURE: 

665 (A) NAME/KEY: AIPL1 IVS2+66G to C Benign 

666 Variants/Polymorphisms 

667 (B) LOCATION: 

668 (D) OTHER INFORMATION: 

669 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

C--> 671 TTT GCC GGG CTG 12 

E--> 672 s-S-HK-G-**^ 

674 (2) INFORMATION FOR SEQ ID NO: 28: 

675 (i) SEQUENCE CHARACTERISTICS: 
676 (A) LENGTH: 12 bases 

677 (B) TYPE: nucleic acid 

678 (C) STRANDEDNESS: single 

679 (D) TOPOLOGY: linear 

680 (ii) MOLECULE TYPE: DNA (genomic) 

681 (ix) FEATURE: 

682 (A) NAME/KEY: AIPL1 IVS2-88C to T Benign 

683 Variants/Polymorphisms 

684 (B) LOCATION: 

685 (D) OTHER INFORMATION: 

686 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

C--> 688 TCC TCT CAG GAG 12 

E--> 689 tG ■ T ^ 

691 (2) INFORMATION FOR SEQ ID NO: 29: 

692 (i) SEQUENCE CHARACTERISTICS: 
693 (A) LENGTH: 12 bases 

694 (B) TYPE: nucleic acid 

695 (C) STRANDEDNESS : single 

696 (D) TOPOLOGY: linear 

697 (ii) MOLECULE TYPE: DNA (genomic) 

698 (ix) FEATURE: 

699 (A) NAME/KEY: AIPL1 IVS2-14G to A Benign 

700 Variants/Polymorphisms 

701 (B) LOCATION: 

702 (D) OTHER INFORMATION: 

703 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

C--> 705 ATC CAT TTA TCC 12 

E--> 706 <■ € ) A * 

708 (2) INFORMATION FOR SEQ ID NO: 30: 

709 (i) SEQUENCE CHARACTERISTICS: 

710 (A) LENGTH: 12 bases 

711 (B) TYPE: nucleic acid 

712 (C) STRANDEDNESS: single 

713 (D) TOPOLOGY: linear 

714 (ii) MOLECULE TYPE: DNA (genomic) 

715 (ix) FEATURE: 

716 (A) NAME/KEY: AIPL1 IVS2-10A to C Benign 

717 Variants/Polymorphisms 

718 (B) LOCATION: 

719 (D) OTHER INFORMATION: 

720 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

C--> 722 CGT TTC TCC CCA 12 

E--> 723 < A > C 

725 (2) INFORMATION FOR SEQ ID NO: 31: 

726 (i) SEQUENCE CHARACTERISTICS: 

727 (A) LENGTH: 12 bases 

728 (B) TYPE: nucleic acid 

729 (C) STRANDEDNESS: single 

730 (D) TOPOLOGY: linear 

731 (ii) MOLECULE TYPE: DNA (genomic) 

732 (ix) FEATURE: 

733 (A) NAME/KEY: AIPL1 IVS3-25T to C Benign 

734 Variants/Polymorphisms 

735 (B) LOCATION: 

736 (D) OTHER INFORMATION: 

737 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

C--> 739 CTG CCC CAC TGA 12 

E--> 740 - T-> C -S-* 

742 (2) INFORMATION FOR SEQ ID NO: 32: 

743 (i) SEQUENCE CHARACTERISTICS: 

744 (A) LENGTH: 12 bases 

745 (B) TYPE: nucleic acid 

746 (C) STRANDEDNESS: single 

747 (D) TOPOLOGY: linear 
748 (ii) MOLECULE TYPE: DNA (genomic) 
749 (ix) FEATURE: 
750 (A) NAME/KEY: AIPL1 IVS3-21T to C Benign 
751 Variants/Polymorphisms 
752 (B) LOCATION: 
753 (D) OTHER INFORMATION: 
754 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

C--> 756 CCT CAC CGA CCT 12 

E--> 757 

759 (2) INFORMATION FOR SEQ ID NO: 33: 
760 (i) SEQUENCE CHARACTERISTICS: 
761 (A) LENGTH: 12 bases 
762 (B) TYPE: nucleic acid 
763 (C) STRANDEDNESS: single 
764 (D) TOPOLOGY: linear 
765 (ii) MOLECULE TYPE: DNA (genomic) 
766 (ix) FEATURE: 
767 (A) NAME/KEY: AIPL1 IVS5+18G to A Benign 
768 Variants/Polymorphisms 
769 (B) LOCATION: 
770 (D) OTHER INFORMATION: 
771 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

C--> 773 AGG AGC GGA CAG 12 

E--> 774 

912 (2) INFORMATION FOR SEQ ID NO: 42: 
913 (i) SEQUENCE CHARACTERISTICS: 
914 (A) LENGTH: 20 bases 
915 (B) TYPE: nucleic acid 
916 (C) STRANDEDNESS: single 
917 (D) TOPOLOGY: linear 
W--> 918 (ii) MOLECULE TYPE: DNA Primer 
919 (ix) FEATURE: 
920 (A) NAME/KEY: AIPL1 primer 
921 (B) LOCATION: 
922 (D) OTHER INFORMATION: 5' to 3' 
923 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 

E--> 925 ^^^G^A^qCATTCTjGCACt^^ 

927 (2) INFORMATION FOR SEQ ID NO: 43: 
928 (i) SEQUENCE CHARACTERISTICS: 
929 (A) LENGTH: 19 bases 
930 (B) TYPE: nucleic acid 
931 (C) STRANDEDNESS: single 
932 (D) TOPOLOGY: linear 
W--> 933 (ii) MOLECULE TYPE: DNA Primer 
934 (ix) FEATURE: 
935 (A) NAME/KEY: AIPL1 primer 
936 (B) LOCATION: 
937 (D) OTHER INFORMATION: 
938 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 



/oh, ^M.^ 
E--> 940 ^^^CAGCTCGTCCAGGTCC'I^/ ^^2^0^ JA^^ ^ 

942 (2) INFORMATION FOR SEQ ID NO: 44: 
943 (i) SEQUENCE CHARACTERISTICS: 
944 (A) LENGTH: 17 bases 
945 (B) TYPE: nucleic acid 
946 (C) STRANDEDNESS: single 
947 (D) TOPOLOGY: linear 
W--> 948 (ii) MOLECULE TYPE: Primer DNA 
949 (ix) FEATURE: 
950 (A) NAME/KEY: AIPL1 primer 
951 (B) LOCATION: 
952 (D) OTHER INFORMATION: 
953 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 

E--> 955 C^G^^C^TCCCTTTCTCCr^) /O^^ 7 ^^ 17 

957 (2) INFORMATION FOR SEQ ID NO: 45: 
958 (i) SEQUENCE CHARACTERISTICS: 
959 (A) LENGTH: 18 bases 
960 (B) TYPE: nucleic acid 
961 (C) STRANDEDNESS: single 
962 (D) TOPOLOGY: linear 
963 (ii) MOLECULE TYPE: Primer DNA (genomic) human 
964 (ix) FEATURE: 
965 (A) NAME/KEY: AIPL1 primer 
966 (B) LOCATION: 
967 (D) OTHER INFORMATION: 
968 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 

E--> 970 5'-GCTGGGGCTGCCTGGCTG-3' 18 

972 (2) INFORMATION FOR SEQ ID NO: 46: 
973 (i) SEQUENCE CHARACTERISTICS: 
974 (A) LENGTH: 20 bases 
975 (B) TYPE: nucleic acid 
976 (C) STRANDEDNESS: single 
977 (D) TOPOLOGY: linear 
978 (ii) MOLECULE TYPE: Primer DNA (genomic) human 
979 (ix) FEATURE: 
980 (A) NAME/KEY: AIPL1 Primer 
981 (B) LOCATION: 
982 (D) OTHER INFORMATION: 
983 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 

E--> 985 5'-CCG^GTGATTACCAGAGGGA-3' 20 

987 (2) INFORMATION FOR SEQ ID NO: 47: 
988 (i) SEQUENCE CHARACTERISTICS: 
989 (A) LENGTH: 20 bases 
990 (B) TYPE: nucleic acid 
991 (C) STRANDEDNESS: single 
992 (D) TOPOLOGY: linear 
993 (ii) MOLECULE TYPE: Primer DNA (genomic) human 
994 (ix) FEATURE: 
995 (A) NAME/KEY: AIPL1 Primer 
996 (B) LOCATION: 
997 (D) OTHER INFORMATION: 
998 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 

E--> 1000 5'-TGAGCTCCAGCACCTCATAG-3' 20 

1002 (2) INFORMATION FOR SEQ ID NO: 48: 
1003 (i) SEQUENCE CHARACTERISTICS: 
1004 (A) LENGTH: 18 bases 
1005 (B) TYPE: nucleic acid 
1006 (C) STRANDEDNESS: single 
1007 (D) TOPOLOGY: linear 
1008 (ii) MOLECULE TYPE: Primer DNA (genomic) human 
1009 (ix) FEATURE: 
1010 (A) NAME/KEY: AIPL1 primer 
1011 (B) LOCATION: 
1012 (D) OTHER INFORMATION: 
1013 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 

E--> 1015 5'-AqGCAG AGGTGTGGAATG 18 

1017 (2) INFORMATION FOR SEQ ID NO: 49: 
1018 (i) SEQUENCE CHARACTERISTICS: 
1019 (A) LENGTH: 19 bases 
1020 (B) TYPE: nucleic acid 
1021 (C) STRANDEDNESS: single 
1022 (D) TOPOLOGY: linear 
1023 (ii) MOLECULE TYPE: Primer DNA (genomic) human 
1024 (ix) FEATURE: 
1025 (A) NAME/KEY: AIPL1 Primer 
1026 (B) LOCATION: 
1027 (D) OTHER INFORMATION: 
1028 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 

E--> 1030 5'-AftA AAG TG AC AC C ACG ATC- 19 

1112 (2) INFORMATION FOR SEQ ID NO: 55: 
1113 (i) SEQUENCE CHARACTERISTICS: 
1114 (A) LENGTH<^689 bases 
1115 (B) TYPE: nucleic acid 
1116 (C) STRANDEDNESS: single 
1117 (D) TOPOLOGY: linear 
1118 (ii) MOLECULE TYPE: cDNA 
1119 (ix) FEATURE: 
1120 (A) NAME/KEY: AIPL1 gene exon/intron Acceptor 
1121 splice site 
1122 (B) LOCATION: 
1123 (D) OTHER INFORMATION: 
1124 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 

E--> 1126 CACTGA CCT GCAGCTCTGGGGCCAG GTTGATGCCC 35 

1193 (2) INFORMATION FOR SEQ ID NO: 60: 
1194 (i) SEQUENCE CHARACTERISTICS: 
1195 (A) LENGTH: 18 bases 
1196 (B) TYPE: nucleic acid 
1197 (C) STRANDEDNESS: single 
1198 (D) TOPOLOGY: linear 

W--> 1199 (ii) MOLECULE TYPE: DNA Primer 

1200 (ix) FEATURE: 

1201 (A) NAME/KEY: AIPL1 gene Exon 1 Primer 

1202 (B) LOCATION: 240 

1203 (D) OTHER INFORMATION : 

1204 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

E--> 1206 ( 5 1 -GGAC A^CTCCCT TTCTCC ^JJ^^ 18 

1208 (2) INFORMATION FOR SEQ ID NO: 61: 

1209 (i) SEQUENCE CHARACTERISTICS: 

1210 (A) LENGTH: 18 bases 

1211 (B) TYPE: nucleic acid 

1212 (C) STRANDEDNESS : single 

1213 (D) TOPOLOGY: linear 
W--> 1214 (ii) MOLECULE TYPE: DNA Primer 

1215 (ix) FEATURE: 

1216 (A) NAME/KEY: AIPL1 gene Exon 1 Primer 

1217 (B) LOCATION: 240 

1218 (D) OTHER INFORMATION: 

1219 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 

E--> 1221 5'-GCTGGGGCTGCCTGGCTG-3' 18 

1223 (2) INFORMATION FOR SEQ ID NO: 62: 

1224 (i) SEQUENCE CHARACTERISTICS: 

1225 (A) LENGTH: 20 bases 

1226 (B) TYPE: nucleic acid 

1227 (C) STRANDEDNESS: single 

1228 (D) TOPOLOGY: linear 
W--> 1229 (ii) MOLECULE TYPE: DNA Primer 

1230 (ix) FEATURE: 

1231 (A) NAME/KEY: AIPL1 gene Exon 2 Primer 

1232 (B) LOCATION: 297 

1233 (D) OTHER INFORMATION: 

1234 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: 

E--> 1236 5'-GGGCCTTGAACAGTGTGTCT-3' 20 

1238 (2) INFORMATION FOR SEQ ID NO: 63: 

1239 (i) SEQUENCE CHARACTERISTICS: 

1240 (A) LENGTH: 19 bases 

1241 (B) TYPE: nucleic acid 

1242 (C) STRANDEDNESS : single 

1243 (D) TOPOLOGY: linear 
W--> 1244 (ii) MOLECULE TYPE: DNA Primer 

1245 (ix) FEATURE: 

1246 (A) NAME/KEY: AIPL1 gene Exon 2 Primer 

1247 (B) LOCATION: 297 

1248 (D) OTHER INFORMATION: 

1249 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 

E--> 1251 V§^raTCCCGAAACACAG 19 
1253 (2) INFORMATION FOR SEQ ID NO: 64: 

1254 (i) SEQUENCE CHARACTERISTICS: 
1255 (A) LENGTH: 18 bases 

1256 (B) TYPE: nucleic acid 

1257 (C) STRANDEDNESS : single 

1258 (D) TOPOLOGY: linear 
W--> 1259 (ii) MOLECULE TYPE: DNA Primer 

1260 (ix) FEATURE: 

1261 (A) NAME/KEY: AIPL1 gene Exon 3 Primer 

1262 (B) LOCATION: 364 

1263 (D) OTHER INFORMATION: 

1264 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: 

E--> 1266 5'-AGTGAGGGAGCAGGATTC-3' 18 

1373 (2) INFORMATION FOR SEQ ID NO: 72: 

1374 (i) SEQUENCE CHARACTERISTICS: 

1375 < A ) LENGTH: ^sF^TaTnlrlo^^ias SauSes for similar errors. 

1376 (B) TYPE: amino acid 

1377 (D) TOPOLOGY: linear 

1378 (ii) MOLECULE TYPE: protein 

1379 (ix) FEATURE: 

1380 (A) NAME/KEY: Human Aipl1 

1381 (B) LOCATION: 

1382 (D) OTHER INFORMATION: 

1383 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 



1385 Met Asp Ala Ala Leu Leu Leu Asn Val Glu Gly Val Lys Lys Thr 
1386 1               5                   10                  15 
1387 Ile Leu His Gly Gly Thr Gly Glu Leu Pro Asn Phe Ile Thr Gly 
1388                 20                  25                  30 
1389 Ser Arg Val Ile Phe His Phe Arg Thr Met Lys Cys Asp Glu Glu 
1390                 35                  40                  45 
1391 Arg Thr Val Ile Asp Asp Ser Arg Gln Val Gly Gln Pro Met His 
1392                 50                  55                  60 
1393 Ile Ile Ile Gly Asn Met Phe Lys Leu Glu Val Trp Glu Ile Leu 
1394                 65                  70                  75 
1395 Leu Thr Ser Met Arg Val His Glu Val Ala Glu Phe Trp Cys Asp 
1396                 80                  85                  90 
1397 Thr Ile His Thr Gly Val Tyr Pro Ile Leu Ser Arg Ser Leu Arg 
1398                 95                  100                 105 
1399 Gln Met Ala Gln Gly Lys Asp Pro Thr Glu Trp His Val His Thr 
1400                 110                 115                 120 
1401 Cys Gly Leu Ala Asn Met Phe Ala Tyr His Thr Leu Gly Tyr Glu 
1402                 125                 130                 135 
1403 Asp Leu Asp Glu Leu Gln Lys Glu Pro Gln Pro Leu Val Phe Val 
1404                 140                 145                 150 
1405 Ile Glu Leu Leu Gln Val Asp Ala Pro Ser Asp Tyr Gln Arg Glu 
1406                 155                 160                 165 
1407 Thr Trp Asn Leu Ser Asn His Glu Lys Met Lys Ala Val Pro Val 
1408                 170                 175                 180 
1409 Leu His Gly Glu Gly Asn Arg Leu Phe Lys Leu Gly Arg Tyr Glu 
1410                 185                 190                 195 
1411 Glu Ala Ser Ser Lys Tyr Gln Glu Ala Ile Ile Cys Leu Arg Asn 






RAW SEQUENCE LISTING DATE: 08/06/2001 

PATENT APPLICATION: US/09/765,061A TIME: 09:19:35



Input Set : A:\converted sequences v2.txt 
Output Set: N:\CRF3\08062001\I765061A.raw 



1412                 200                 205                 210
1413 Leu Gln Thr Lys Glu Lys Pro Trp Glu Val Gln Trp Leu Lys Leu
1414                 215                 220                 225
1415 Glu Lys Met Ile Asn Thr Leu Ile Leu Asn Tyr Cys Gln Cys Leu
1416                 230                 235                 240
1417 Leu Lys Lys Glu Glu Tyr Tyr Glu Val Leu Glu His Thr Ser Asp
1418                 245                 250                 255
1419 Ile Leu Arg His His Pro Gly Ile Val Lys Ala Tyr Tyr Val Arg
1420                 260                 265                 270
1421 Ala Arg Ala His Ala Glu Val Trp Asn Glu Ala Glu Ala Lys Ala
1422                 275                 280                 285
1423 Asp Leu Gln Lys Val Leu Glu Leu Glu Pro Ser Met Gln Lys Ala
1424                 290                 295                 300
1425 Val Arg Arg Glu Leu Arg Leu Leu Glu Asn Arg Met Ala Glu Lys
1426                 305                 310                 315
E--> 1427 Gln Glu Glu Glu Arg Leu Xxx Cys Arg Asn Met Leu Ser Gln Gly
1428                 320                 325                 330
1429 Ala Thr Gln Pro Pro Ala Glu Pro Pro Thr Glu Pro Pro Ala Gln
1430                 335                 340                 345
1431 Ser Ser Thr Glu Pro Pro Ala Glu Pro Pro Thr Ala Pro Ser Ala
1432                 350                 355                 360
1433 Glu Leu Ser Ala Gly Pro Pro Ala Glu Pro Ala Thr Glu Pro Pro
1434                 365                 370                 375
1435 Pro Ser Pro Gly His Ser Leu Gln His
E--> 1436                 380
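Line 1427 above contains a designator that is not a standard three-letter amino acid code, and the numbering ends one residue away from the stated length. As a rough illustration only, the following Python sketch shows one way such a listing could be pre-checked before submission. The function name check_rows, the set of accepted codes (including Xaa for an unknown residue) and the three sample rows are assumptions made for this example; this is not the logic of the Checker program.

```python
# Twenty standard three-letter codes plus Xaa (unknown residue).
# Accepting Xaa is an assumption made for this sketch.
VALID_CODES = frozenset(
    "Ala Arg Asn Asp Cys Gln Glu Gly His Ile Leu Lys Met Phe Pro "
    "Ser Thr Trp Tyr Val Xaa".split()
)


def check_rows(rows, declared_length):
    """Count residues in whitespace-separated rows and list unrecognised codes."""
    codes = [token for row in rows for token in row.split()]
    # Report each unrecognised code with its 1-based position in the excerpt.
    invalid = [(i + 1, c) for i, c in enumerate(codes) if c not in VALID_CODES]
    return {
        "counted": len(codes),
        "declared": declared_length,
        "length_matches": len(codes) == declared_length,
        "invalid": invalid,
    }


# Hypothetical three-row excerpt, not a complete sequence.
rows = [
    "Gln Glu Glu Glu Arg Leu Xxx Cys Arg Asn Met Leu Ser Gln Gly",
    "Ala Thr Gln Pro Pro Ala Glu Pro Pro Thr Glu Pro Pro Ala Gln",
    "Pro Ser Pro Gly His Ser Leu Gln His",
]
report = check_rows(rows, declared_length=38)
print(report)
```

In this excerpt the declared length is deliberately one short of the residues present, so the sketch reports both a length mismatch and the unrecognised designator.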


















1438 (2) INFORMATION FOR SEQ ID NO: 73:
1439      (i) SEQUENCE CHARACTERISTICS:
1440           (A) LENGTH: 369 amino acids
1441           (B) TYPE: amino acid
1442           (D) TOPOLOGY: linear
1443      (ii) MOLECULE TYPE: protein
1444      (ix) FEATURE:
1445           (A) NAME/KEY: Chimpansee Aipll
1446           (B) LOCATION:
1447           (D) OTHER INFORMATION:
1448      (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:








1450 Met Asp Ala Ala Leu Leu Leu Asn Val Glu Gly Val Lys Lys Thr
1451 1               5                   10                  15
1452 Ile Leu His Gly Gly Thr Gly Glu Leu Pro Asn Phe Ile Thr Gly
1453                 20                  25                  30
1454 Ser Arg Val Ile Phe His Phe Arg Thr Met Lys Cys Asp Glu Glu
1455                 35                  40                  45
1456 Arg Thr Val Ile Asp Asp Ser Arg Gln Val Gly Gln Pro Met His
1457                 50                  55                  60
1458 Ile Ile Ile Gly Asn Met Phe Lys Leu Glu Val Trp Glu Ile Leu
1459                 65                  70                  75
1460 Leu Thr Ser Met Arg Val His Glu Val Ala Glu Phe Trp Cys Asp
1461                 80                  85                  90
1462 Thr Ile His Thr Gly Val Tyr Pro Ile Leu Ser Arg Ser Leu Arg






1463                 95                  100                 105
1464 Gln Met Ala Gln Gly Lys Asp Pro Thr Glu Trp His Val His Thr
1465                 110                 115                 120
1466 Cys Gly Leu Ala Asn Met Phe Ala Tyr His Thr Leu Gly Tyr Glu
1467                 125                 130                 135
1468 Asp Leu Asp Glu Leu Gln Lys Glu Pro Gln Pro Leu Val Phe Val
1469                 140                 145                 150
1470 Ile Glu Leu Leu Gln Val Asp Ala Pro Ser Asp Tyr Gln Arg Glu
1471                 155                 160                 165
1472 Thr Trp Asn Leu Ser Asn His Glu Lys Met Lys Ala Val Pro Val
1473                 170                 175                 180
1474 Leu His Gly Glu Gly Asn Arg Leu Phe Lys Leu Gly Arg Tyr Glu
1475                 185                 190                 195
1476 Glu Ala Ser Ser Lys Tyr Gln Glu Ala Ile Ile Cys Leu Arg Asn
1477                 200                 205                 210
1478 Leu Gln Thr Lys Glu Lys Pro Trp Glu Val Gln Trp Leu Lys Leu
1479                 215                 220                 225
1480 Glu Lys Met Ile Asn Thr Leu Ile Leu Asn Tyr Cys Gln Cys Leu
1481                 230                 235                 240
1482 Leu Lys Lys Glu Glu Tyr Tyr Glu Val Leu Glu His Thr Ser Asp
1483                 245                 250                 255
1484 Ile Leu Arg His His Pro Gly Ile Val Lys Ala Tyr Tyr Val Arg
1485                 260                 265                 270
1486 Ala Arg Ala His Ala Glu Val Trp Asn Glu Ala Glu Ala Lys Ala
1487                 275                 280                 285
1488 Asp Leu Arg Lys Val Leu Glu Leu Glu Pro Ser Met Gln Lys Ala
1489                 290                 295                 300
1490 Val Arg Arg Glu Leu Arg Leu Leu Glu Asn Arg Met Ala Glu Lys
1491                 305                 310                 315
1492 Gln Glu Glu Glu Arg Leu Arg Cys Arg Asn Met Leu Ser Gln Gly
1493                 320                 325                 330
1494 Ala Thr Gln Pro Pro Ala Glu Pro Pro Thr Glu Pro Pro Ala Gln
1495                 335                 340                 345
1496 Ser Ser Thr Glu Pro Pro Ala Glu Pro Pro Pro Ala Pro Ser Ala
1497                 350                 355                 360
1498 Glu Leu Ser Ala Gly Pro Pro Ala Glu Thr Ala Thr Glu Pro Pro
1499                 365                 370                 375
1500 Pro Ser Pro Gly His Ser Leu Gln His














1501 










-365* 






























3$o 
(3) Computer: Apple Macintosh;

(i) Operating System: Macintosh; 

(ii) Macintosh File Type: text with line 
termination 

(iii) Line Terminator: Pre-defined by
text type file;

(iv) Pagination: Pre-defined by text
type file;

(v) End-of-file: Pre-defined by text
type file;

(vi) Media: (A) Diskette, 3.50 inch, 400
Kb storage;

(B) Diskette, 3.50 inch, 800 Kb
storage;

(C) Diskette, 3.50 inch, 1.4 Mb
storage;

(vii) Print Command: Use PRINT
command from any Macintosh
Application that processes text files,
such as MacWrite or TeachText;

(4) Magnetic tape: 0.5 inch, up to 2400
feet;

(i) Density: 1600 or 6250 bits per inch,
9 track;

(ii) Format: raw, unblocked;

(iii) Line Terminator: ASCII Carriage
Return plus optional ASCII Line Feed;

(iv) Pagination: ASCII Form Feed or
Series of Line Terminators;

(v) Print Command (Unix shell version
given here as sample response:
mt/dev/rmt0; lpr/dev/rmt0):

(g) Computer readable forms that are 
submitted to the Office will not be 
returned to the applicant. 

(h) All computer readable forms shall
have a label permanently affixed thereto
on which has been hand printed or
typed, a description of the format of the
computer readable form as well as the
name of the applicant, the title of the
invention, the date on which the data
were recorded on the computer readable
form and the name and type of computer
and operating system which generated
the files on the computer readable form.
If all of this information cannot be
printed on a label affixed to the
computer readable form, by reason of
size or otherwise, the label shall include
the name of the applicant and the title of
the invention and a reference number,
and the additional information may be
provided on a container for the
computer readable form with the name
of the applicant, the title of the
invention, the reference number and the
additional information affixed to the
container. If the computer readable form
is submitted after the date of filing
under 35 U.S.C. 111, after the date of
entry in the national stage under 35
U.S.C. 371 or after the time of filing, in
the United States Receiving Office, an
international application under the PCT,
the labels mentioned herein must also
include the date of the application and
the application number, including series
code and serial number.

Amendments to or replacement of
sequence listing and computer readable
copy thereof.

(a) Any amendment to the paper copy
of the "Sequence Listing" (§ 1.821(c))
must be made by the submission of
substitute sheets. Amendments must be
accompanied by a statement that
indicates support for the amendment in
the application, as filed, and a statement
that the substitute sheets include no
new matter. Such a statement must be a
verified statement if made by a person
not registered to practice before the
Office.

(b) Any amendment to the paper copy
of the "Sequence Listing" in accordance
with paragraph (a) of this section, must
be accompanied by a substitute copy of
the computer readable form (§ 1.821(e))
including all previously submitted data
with the amendment incorporated
therein, accompanied by a statement
that the copy in computer readable form
is the same as the substitute copy of the
"Sequence Listing." Such a statement
must be a verified statement if made by
a person not registered to practice
before the Office.

(c) Any appropriate amendments to
the "Sequence Listing" in a patent, e.g.,
by reason of reissue or certificate of
correction, must comply with the
requirements of paragraphs (a) and (b)
of this section.

(d) If, upon receipt, the computer
readable form is found to be damaged or
unreadable, applicant must provide,
within such time as set by the
Commissioner, a substitute copy of the
data in computer readable form
accompanied by a statement that the
substitute data is identical to that
originally filed. Such a statement must
be a verified statement if made by a
person not registered to practice before
the Office.

Appendix A - Sample Sequence Listing
(1) GENERAL INFORMATION:
     (i) APPLICANT: Doe, Joan X, Doe, John Q
     (ii) TITLE OF INVENTION: Isolation and Characterization of a Gene Encoding a
          Protease from Paramecium sp.
     (iii) NUMBER OF SEQUENCES: 2
     (iv) CORRESPONDENCE ADDRESS:
          (A) ADDRESSEE: Smith and Jones
          (B) STREET: 123 Main Street
          (C) CITY: Smalltown
          (D) STATE: Anystate
          (E) COUNTRY: USA
     (v) COMPUTER READABLE FORM:
          (A) MEDIUM TYPE: Diskette, 3.50 inch, 800 Kb storage
          (B) COMPUTER: Apple Macintosh
          (C) OPERATING SYSTEM: Macintosh 5.0
          (D) SOFTWARE: MacWrite
     (vi) CURRENT APPLICATION DATA:
          (A) APPLICATION NUMBER: 09/999,999
          (B) FILING DATE: 28-FEB-1989
          (C) CLASSIFICATION: 999/99
     (vii) PRIOR APPLICATION DATA:
          (A) APPLICATION NUMBER: PCT/US88/99999
          (B) FILING DATE: 01-MAR-1988
     (viii) ATTORNEY/AGENT INFORMATION:
          (A) NAME: Smith, John A.
          (B) REGISTRATION NUMBER: 00001
          (C) REFERENCE/DOCKET NUMBER: 01-0001
     (ix) TELECOMMUNICATION INFORMATION:
          (A) TELEPHONE: (909) 999-0001
          (B) TELEFAX: (909) 999-0002

(2) INFORMATION FOR SEQ ID NO: 1:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 954 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear
     (ii) MOLECULE TYPE: genomic DNA
     (iii) HYPOTHETICAL: yes
     (iv) ANTI-SENSE: no
     (vi) ORIGINAL SOURCE:
          (A) ORGANISM: Paramecium sp
          (C) INDIVIDUAL/ISOLATE: XYZ2
          (G) CELL TYPE: unicellular organism
     (vii) IMMEDIATE SOURCE:
          (A) LIBRARY: genomic
          (B) CLONE: Para-XYZ2/36
     (x) PUBLICATION INFORMATION:
          (A) AUTHORS: Doe, Joan X, Doe, John Q
          (B) TITLE: Isolation and Characterization
               of a Gene Encoding a Protease from
               Paramecium sp.
          (C) JOURNAL: Fictional Genes
          (D) VOLUME: I
          (E) ISSUE: 1
          (F) PAGES: 1-20
          (G) DATE: 02-MAR-1988
          (K) RELEVANT RESIDUES IN SEQ ID NO:
               1: FROM 1 TO 954

     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

ATCGGGATAG TACTGGTCAA GACCGGTGGA CACCGGTTAA CCCCGGTTAA GTACCGGTTA 60

TAGGCCATTT CAGGCCAAAT GTGCCCAACT ACGCCAATTG TTTTGCCAAC GGGCAACGTT 120

ACGTTCGTAC GCACGTATGT ACCTAGGTAC TTACGGACGT GACTACGGAC ACTTCCGTAC 180

GTACGTACGT TTACGTACCC ATCCCAACGT AACCACAGTG TGGTCGCAGT GTCCCAGTGT 240

ACACAGACTG CCAGACATTC TTCACAGACA CCCC ATG ACA CCA CCT GAA CGT CTC 295
                                      Met Thr Pro Pro Glu Arg Leu
                                                      -30

TTC CTC CCA AGG GTG TGT GGC ACC ACC CTA CAC CTC CTC CTT CTG GGG 343
Phe Leu Pro Arg Val Cys Gly Thr Thr Leu His Leu Leu Leu Leu Gly
        -25                 -20                 -15

CTG CTG CTG GTT CTG CTG CCT GGG GCC CAT GTGAGGCAGC AGGAGAATGG 393
Leu Leu Leu Val Leu Leu Pro Gly Ala His
    -10                 -5

GGTGGCTCAG CCAAACCTTG AGCCCTAGAG CCCCCCTCAA CTCTGTTCTC CTAG GGG 450
                                                            Gly

CTC ATG CAT CTT GCC CAC AGC AAC CTC AAA CCT GCT GCT CAC CTC ATT 498
Leu Met His Leu Ala His Ser Asn Leu Lys Pro Ala Ala His Leu Ile
1               5                   10                  15

GTAAACATCC ACCTGACCTC CCAGACATGT CCCCACCAGC TCTCCTCCTA CCCCTGCCTC 558

AGGAACCCAA GCATCCACCC CTCTCCCCCA ACTTCCCCCA CGCTAAAAAA AACAGAGGGA 618

GCCCACTCCT ATGCCTCCCC CTGCCATCCC CCAGGAACTC AGTTGTTCAG TGCCCACTTC 678

TAC CCC AGC AAG CAG AAC TCA CTG CTC TGG AGA GCA AAC ACG GAC CGT 726
Tyr Pro Ser Lys Gln Asn Ser Leu Leu Trp Arg Ala Asn Thr Asp Arg
            20                  25                  30

GCC TTC CTC CAG GAT GGT TTC TCC TTG AGC AAC AAT TCT CTC CTG GTC 774
Ala Phe Leu Gln Asp Gly Phe Ser Leu Ser Asn Asn Ser Leu Leu Val
        35                  40                  45

TAGAAAAAAT AATTGATTTC AAGACCTTCT CCCCATTCTG CCTCCATTCT GACCATTTCA 834

GGGGTCGTCA CCACCTCTCC TTTGGCCATT CCAACAGCTC AAGTCTTCCC TGATCAAGTC 894

ACCGGAGCTT TCAAAGAAGG AATTCTAGGC ATCCCAGGGG ACCCACACCT CCCTGAACCA 954

(2) INFORMATION FOR SEQ ID NO: 2:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 82 amino acids
          (B) TYPE: amino acid
          (D) TOPOLOGY: linear
     (ii) MOLECULE TYPE: protein
     (ix) FEATURE:
          (A) NAME/KEY: signal sequence
          (B) LOCATION: -34 to -1
          (C) IDENTIFICATION METHOD: similarity
               to other signal sequences, hydrophobic
          (D) OTHER INFORMATION: expresses
               protease
     (x) PUBLICATION INFORMATION:
          (A) AUTHORS: Doe, Joan X, Doe, John Q
          (B) TITLE: Isolation and Characterization
               of a Gene Encoding a Protease from
               Paramecium sp.
          (C) JOURNAL: Fictional Genes
          (D) VOLUME: I
          (E) ISSUE: 1
          (F) PAGES: 1-20
          (G) DATE: 02-MAR-1988
          (K) RELEVANT RESIDUES IN SEQ ID NO:
               2: FROM -34 TO 48
     (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Thr Pro Pro Glu Arg Leu Phe Leu Pro Arg Val Cys Gly Thr Thr
                -30                 -25                 -20

Leu His Leu Leu Leu Leu Gly Leu Leu Leu Val Leu Leu Pro Gly Ala
            -15                 -10                 -5

His Gly Leu Met His Leu Ala His Ser Asn Leu Lys Pro Ala Ala His
        1               5                   10

Leu Ile Tyr Pro Ser Lys Gln Asn Ser Leu Leu Trp Arg Ala Asn Thr
15                  20                  25                  30

Asp Arg Ala Phe Leu Gln Asp Gly Phe Ser Leu Ser Asn Asn Ser Leu
                35                  40                  45

Leu Val
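The sample listing for SEQ ID NO: 2 appears to number the signal sequence with negative positions and the remainder from 1 upward, with no position 0. Under that assumption, the stated length can be cross-checked against the numbering range and against the residues actually listed. The following Python sketch illustrates the arithmetic; the helper names span_length and position_of are hypothetical, and the rows are the sample residues as read from the listing above.

```python
def span_length(first, last):
    """Number of positions from first to last when numbering skips 0."""
    if first > last:
        raise ValueError("first must not exceed last")
    if first < 0 < last:
        return last - first  # e.g. -34..48 covers 34 + 48 positions
    return last - first + 1


def position_of(index, first):
    """Position number of the residue at a 0-based index, skipping 0."""
    pos = first + index
    return pos + 1 if first < 0 and pos >= 0 else pos


# Residues of the sample listing, one string per printed row.
SAMPLE_ROWS = [
    "Met Thr Pro Pro Glu Arg Leu Phe Leu Pro Arg Val Cys Gly Thr Thr",
    "Leu His Leu Leu Leu Leu Gly Leu Leu Leu Val Leu Leu Pro Gly Ala",
    "His Gly Leu Met His Leu Ala His Ser Asn Leu Lys Pro Ala Ala His",
    "Leu Ile Tyr Pro Ser Lys Gln Asn Ser Leu Leu Trp Arg Ala Asn Thr",
    "Asp Arg Ala Phe Leu Gln Asp Gly Phe Ser Leu Ser Asn Asn Ser Leu",
    "Leu Val",
]
codes = [code for row in SAMPLE_ROWS for code in row.split()]

print(len(codes), span_length(-34, 48), position_of(len(codes) - 1, -34))
```

If the counted residues, the numbering range and the stated LENGTH disagree, at least one of the three is wrong in the submitted listing.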
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VERIFICATION SUMMARY 

PATENT APPLICATION: US/09/765,061A
DATE: 08/06/2001
TIME: 09:19:36

Input Set : A:\converted sequences v2.txt
Output Set: N:\CRF3\08062001\I765061A.raw

L:0 M:200 E: Mandatory Header Field missing, [(i) APPLICANT:] of (1)
L:0 M:200 E: Mandatory Header Field missing, [(ii) TITLE OF INVENTION:] of (1)
L:0 M:200 E: Mandatory Header Field missing, [(A) ADDRESSEE:] of (1)(iv)
L:0 M:200 E: Mandatory Header Field missing, [(B) STREET:] of (1)(iv)
L:0 M:200 E: Mandatory Header Field missing, [(C) CITY:] of (1)(iv)
L:0 M:249 C: Inserted Mandatory Field, [(vi) CURRENT APPLICATION DATA:]
L:0 M:249 C: Inserted Mandatory Field, [(A) APPLICATION NUMBER:]
L:0 M:249 C: Inserted Mandatory Field, [(B) FILING DATE:]
L:18 M:111 C: (47) String data converted to upper case,
M:111 Repeated in SeqNo=1
L:129 M:254 E: No. of Bases conflict, Input:6749 Counted:6689 SEQ:1
L:129 M:204 E: No. of Bases differ, LENGTH: Input:6749 Counted:6689 SEQ:1
L:145 M:111 C: (47) String data converted to upper case,
M:111 Repeated in SeqNo=2
L:179 M:111 C: (47) String data converted to upper case,
M:111 Repeated in SeqNo=3
L:214 M:111 C: (47) String data converted to upper case,
M:111 Repeated in SeqNo=4
L:248 M:111 C: (47) String data converted to upper case,
M:111 Repeated in SeqNo=5
L:279 M:111 C: (47) String data converted to upper case,
M:111 Repeated in SeqNo=6
L:312 M:111 C: (47) String data converted to upper case,
M:111 Repeated in SeqNo=7
L:347 M:111 C: (47) String data converted to upper case,
M:111 Repeated in SeqNo=8
L:365 M:254 E: No. of Bases conflict, Input:1129 Counted:1119 SEQ:8
L:365 M:204 E: No. of Bases differ, LENGTH: Input:1129 Counted:1119 SEQ:8
L:380 M:111 C: (47) String data converted to upper case,
L:396 M:111 C: (47) String data converted to upper case,
L:397 M:254 E: No. of Bases conflict, Input:0 Counted:15 SEQ:10
L:397 M:320 E: (1) Wrong Nucleic Acid Designator, 1
L:397 M:204 E: No. of Bases differ, LENGTH: Input:15 Counted:16 SEQ:10
L:412 M:111 C: (47) String data converted to upper case,
L:428 M:111 C: (47) String data converted to upper case,
L:444 M:111 C: (47) String data converted to upper case,
L:460 M:111 C: (47) String data converted to upper case,
L:461 M:254 E: No. of Bases conflict, Input:0 Counted:15 SEQ:14
L:461 M:320 E: (1) Wrong Nucleic Acid Designator, 1
L:461 M:204 E: No. of Bases differ, LENGTH: Input:15 Counted:16 SEQ:14
L:476 M:111 C: (47) String data converted to upper case,
L:492 M:111 C: (47) String data converted to upper case,
L:493 M:254 E: No. of Bases conflict, Input:0 Counted:15 SEQ:16
L:493 M:320 E: (1) Wrong Nucleic Acid Designator, 1
L:493 M:204 E: No. of Bases differ, LENGTH: Input:15 Counted:16 SEQ:16
L:508 M:111 C: (47) String data converted to upper case,
L:509 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS
L:509 M:333 E: Wrong sequence grouping, Amino acids not in groups!
L:524 M:111 C: (47) String data converted to upper case,
L:540 M:111 C: (47) String data converted to upper case,
L:557 M:111 C: (47) String data converted to upper case,
L:558 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS: 2
L:558 M:333 E: Wrong sequence grouping, Amino acids not in groups!
L:573 M:111 C: (47) String data converted to upper case,
L:589 M:111 C: (47) String data converted to upper case,
L:590 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS: 2
L:590 M:333 E: Wrong sequence grouping, Amino acids not in groups!
L:605 M:111 C: (47) String data converted to upper case,
L:606 M:254 E: No. of Bases conflict, Input:0 Counted:15 SEQ:23
L:606 M:320 E: (1) Wrong Nucleic Acid Designator, 1
L:606 M:204 E: No. of Bases differ, LENGTH: Input:15 Counted:16 SEQ:23
L:621 M:111 C: (47) String data converted to upper case,
L:622 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS: 2
L:622 M:333 E: Wrong sequence grouping, Amino acids not in groups!
L:637 M:111 C: (47) String data converted to upper case,
L:638 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS: 2
L:638 M:333 E: Wrong sequence grouping, Amino acids not in groups!
L:654 M:111 C: (47) String data converted to upper case,
L:655 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS: 2
L:655 M:333 E: Wrong sequence grouping, Amino acids not in groups!
L:671 M:111 C: (47) String data converted to upper case,
L:672 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS: 2
L:672 M:333 E: Wrong sequence grouping, Amino acids not in groups!
L:688 M:111 C: (47) String data converted to upper case,
L:689 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS: 2
L:689 M:333 E: Wrong sequence grouping, Amino acids not in groups!
L:705 M:111 C: (47) String data converted to upper case,
L:706 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS: 2
L:706 M:333 E: Wrong sequence grouping, Amino acids not in groups!
L:722 M:111 C: (47) String data converted to upper case,
L:723 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS: 2
L:723 M:333 E: Wrong sequence grouping, Amino acids not in groups!
L:739 M:111 C: (47) String data converted to upper case,
L:740 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS: 2
L:740 M:333 E: Wrong sequence grouping, Amino acids not in groups!
L:756 M:111 C: (47) String data converted to upper case,
L:757 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS: 2
L:757 M:333 E: Wrong sequence grouping, Amino acids not in groups!
L:773 M:111 C: (47) String data converted to upper case,
L:774 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS: 2
L:774 M:333 E: Wrong sequence grouping, Amino acids not in groups!
L:790 M:111 C: (47) String data converted to upper case,
L:807 M:111 C: (47) String data converted to upper case,
L:824 M:111 C: (47) String data converted to upper case,
L:841 M:111 C: (47) String data converted to upper case,
L:858 M:111 C: (47) String data converted to upper case,
L:875 M:111 C: (47) String data converted to upper case,
L:892 M:111 C: (47) String data converted to upper case,
L:909 M:111 C: (47) String data converted to upper case,
L:918 M:246 W: Invalid value of Alpha Sequence Header Field, [MOLECULE TYPE:], SeqNo=42
L:925 M:254 E: No. of Bases conflict, Input:20 Counted:20 SEQ:42
L:925 M:320 E: (1) Wrong Nucleic Acid Designator, 6
M:111 Repeated in SeqNo=42
L:925 M:204 E: No. of Bases differ, LENGTH: Input:20 Counted:24 SEQ:42
L:933 M:246 W: Invalid value of Alpha Sequence Header Field, [MOLECULE TYPE:], SeqNo=43
L:940 M:254 E: No. of Bases conflict, Input:19 Counted:19 SEQ:43
L:940 M:320 E: (1) Wrong Nucleic Acid Designator, 6
L:940 M:204 E: No. of Bases differ, LENGTH: Input:19 Counted:23 SEQ:43
L:948 M:246 W: Invalid value of Alpha Sequence Header Field, [MOLECULE TYPE:], SeqNo=44
L:955 M:254 E: No. of Bases conflict, Input:17 Counted:17 SEQ:44
L:955 M:320 E: (1) Wrong Nucleic Acid Designator, 6
L:955 M:204 E: No. of Bases differ, LENGTH: Input:17 Counted:21 SEQ:44
L:970 M:254 E: No. of Bases conflict, Input:18 Counted:18 SEQ:45
L:970 M:320 E: (1) Wrong Nucleic Acid Designator, 6
L:970 M:204 E: No. of Bases differ, LENGTH: Input:18 Counted:22 SEQ:45
L:985 M:254 E: No. of Bases conflict, Input:20 Counted:20 SEQ:46
L:985 M:320 E: (1) Wrong Nucleic Acid Designator, 6
L:985 M:204 E: No. of Bases differ, LENGTH: Input:20 Counted:24 SEQ:46
L:1000 M:254 E: No. of Bases conflict, Input:20 Counted:20 SEQ:47
L:1000 M:320 E: (1) Wrong Nucleic Acid Designator, 6
L:1000 M:204 E: No. of Bases differ, LENGTH: Input:20 Counted:24 SEQ:47
L:1015 M:254 E: No. of Bases conflict, Input:18 Counted:18 SEQ:48
L:1015 M:320 E: (1) Wrong Nucleic Acid Designator, 6
L:1015 M:204 E: No. of Bases differ, LENGTH: Input:18 Counted:22 SEQ:48
L:1030 M:254 E: No. of Bases conflict, Input:19 Counted:19 SEQ:49
L:1030 M:320 E: (1) Wrong Nucleic Acid Designator, 6
L:1030 M:204 E: No. of Bases differ, LENGTH: Input:19 Counted:23 SEQ:49
L:1126 M:333 E: Wrong sequence grouping, Nucleotides not in groups!
L:1126 M:204 E: No. of Bases differ, LENGTH: Input:6689 Counted:35 SEQ:55
L:1199 M:246 W: Invalid value of Alpha Sequence Header Field, [MOLECULE TYPE:], SeqNo=60
L:1206 M:254 E: No. of Bases conflict, Input:18 Counted:18 SEQ:60
L:1206 M:320 E: (1) Wrong Nucleic Acid Designator, 6
L:1206 M:204 E: No. of Bases differ, LENGTH: Input:18 Counted:22 SEQ:60
L:1214 M:246 W: Invalid value of Alpha Sequence Header Field, [MOLECULE TYPE:], SeqNo=61
L:1221 M:254 E: No. of Bases conflict, Input:18 Counted:18 SEQ:61
L:1221 M:320 E: (1) Wrong Nucleic Acid Designator, 6
L:1221 M:204 E: No. of Bases differ, LENGTH: Input:18 Counted:22 SEQ:61
L:1229 M:246 W: Invalid value of Alpha Sequence Header Field, [MOLECULE TYPE:], SeqNo=62
L:1236 M:254 E: No. of Bases conflict, Input:20 Counted:20 SEQ:62
L:1236 M:320 E: (1) Wrong Nucleic Acid Designator, 6
L:1236 M:204 E: No. of Bases differ, LENGTH: Input:20 Counted:24 SEQ:62
L:1244 M:246 W: Invalid value of Alpha Sequence Header Field, [MOLECULE TYPE:], SeqNo=63
L:1251 M:254 E: No. of Bases conflict, Input:19 Counted:19 SEQ:63
L:1251 M:320 E: (1) Wrong Nucleic Acid Designator, 6
L:1251 M:204 E: No. of Bases differ, LENGTH: Input:19 Counted:23 SEQ:63
L:1259 M:246 W: Invalid value of Alpha Sequence Header Field, [MOLECULE TYPE:], SeqNo=64
L:1266 M:254 E: No. of Bases conflict, Input:18 Counted:18 SEQ:64
L:1266 M:320 E: (1) Wrong Nucleic Acid Designator, 6
L:1266 M:204 E: No. of Bases differ, LENGTH: Input:18 Counted:22 SEQ:64
L:1274 M:246 W: Invalid value of Alpha Sequence Header Field, [MOLECULE TYPE:], SeqNo=65
L:1281 M:254 E: No. of Bases conflict, Input:20 Counted:20 SEQ:65
L:1281 M:320 E: (1) Wrong Nucleic Acid Designator, 6
L:1281 M:204 E: No. of Bases differ, LENGTH: Input:20 Counted:24 SEQ:65
L:1289 M:246 W: Invalid value of Alpha Sequence Header Field, [MOLECULE TYPE:], SeqNo=66
L:1296 M:254 E: No. of Bases conflict, Input:18 Counted:18 SEQ:66
L:1296 M:320 E: (1) Wrong Nucleic Acid Designator, 6
L:1296 M:204 E: No. of Bases differ, LENGTH: Input:18 Counted:22 SEQ:66
L:1304 M:246 W: Invalid value of Alpha Sequence Header Field, [MOLECULE TYPE:], SeqNo=67
L:1311 M:254 E: No. of Bases conflict, Input:17 Counted:17 SEQ:67
L:1311 M:320 E: (1) Wrong Nucleic Acid Designator, 6
L:1311 M:204 E: No. of Bases differ, LENGTH: Input:17 Counted:21 SEQ:67
L:1319 M:246 W: Invalid value of Alpha Sequence Header Field, [MOLECULE TYPE:], SeqNo=68
L:1326 M:254 E: No. of Bases conflict, Input:19 Counted:19 SEQ:68
L:1326 M:320 E: (1) Wrong Nucleic Acid Designator, 6
L:1326 M:204 E: No. of Bases differ, LENGTH: Input:19 Counted:23 SEQ:68
L:1334 M:246 W: Invalid value of Alpha Sequence Header Field, [MOLECULE TYPE:], SeqNo=69
L:1341 M:254 E: No. of Bases conflict, Input:18 Counted:18 SEQ:69
L:1341 M:320 E: (1) Wrong Nucleic Acid Designator, 6
L:1341 M:204 E: No. of Bases differ, LENGTH: Input:18 Counted:22 SEQ:69
L:1349 M:246 W: Invalid value of Alpha Sequence Header Field, [MOLECULE TYPE:], SeqNo=70
L:1356 M:254 E: No. of Bases conflict, Input:18 Counted:18 SEQ:70
L:1356 M:320 E: (1) Wrong Nucleic Acid Designator, 6
L:1356 M:204 E: No. of Bases differ, LENGTH: Input:18 Counted:22 SEQ:70
L:1364 M:246 W: Invalid value of Alpha Sequence Header Field, [MOLECULE TYPE:], SeqNo=71
L:1371 M:254 E: No. of Bases conflict, Input:19 Counted:19 SEQ:71
L:1371 M:320 E: (1) Wrong Nucleic Acid Designator, 6
L:1371 M:204 E: No. of Bases differ, LENGTH: Input:19 Counted:23 SEQ:71
L:1427 M:330 E: (2) Invalid Amino Acid Designator, 1
L:1436 M:332 E: (32) Invalid/Missing Amino Acid Numbering, SEQ ID: 72
L:1436 M:203 E: No. of Seq. differs, LENGTH: Input:383 Found:384 SEQ:72
L:1501 M:332 E: (32) Invalid/Missing Amino Acid Numbering, SEQ ID: 73
L:1501 M:203 E: No. of Seq. differs, LENGTH: Input:369 Found:384 SEQ:73
L:3 M:203 E: No. of Seq. differs, : Input 1, Counted 78
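For anyone working through a long summary of this kind, it can help to load the entries into a structured form and tally them. The following Python sketch assumes each entry sits on one line in the form "L:<line> M:<code> <severity>: <text>", and that the letters E, W and C mark errors, warnings and automatic corrections respectively; that reading of the letters, the names LINE_RE and parse_summary, and the choice of sample lines are assumptions for illustration, not documented behaviour of the Checker program.

```python
import re
from collections import Counter

# One summary entry: listing line number, message code, severity letter, text.
LINE_RE = re.compile(
    r"^L:(?P<line>\d+)\s+M:(?P<code>\d+)\s+(?P<sev>[EWC]):\s*(?P<text>.*)$"
)


def parse_summary(lines):
    """Return one dict per entry; lines that do not match are skipped."""
    records = []
    for raw in lines:
        m = LINE_RE.match(raw.strip())
        if m:
            records.append({
                "line": int(m["line"]),
                "code": int(m["code"]),
                "severity": m["sev"],
                "text": m["text"],
            })
    return records


# A few entries copied from the summary, plus one "Repeated" note that
# carries no L: field and is therefore skipped by the parser.
SAMPLE = [
    "L:1427 M:330 E: (2) Invalid Amino Acid Designator, 1",
    "L:1436 M:332 E: (32) Invalid/Missing Amino Acid Numbering, SEQ ID: 72",
    "L:1436 M:203 E: No. of Seq. differs, LENGTH: Input:383 Found:384 SEQ:72",
    "L:1259 M:246 W: Invalid value of Alpha Sequence Header Field, [MOLECULE TYPE:], SeqNo=64",
    "L:18 M:111 C: (47) String data converted to upper case,",
    "M:111 Repeated in SeqNo=1",
]
records = parse_summary(SAMPLE)
by_severity = Counter(r["severity"] for r in records)
print(by_severity)
```

Grouping the records by the "line" field then points back to the numbered lines of the raw sequence listing that need correction.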



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